Abstract
Background: Pakistan carries one of the highest β-thalassemia burdens worldwide, with an estimated carrier frequency of 5-7% and over 100,000 affected individuals (WHO, 2023). However, comprehensive molecular characterization remains limited. High consanguinity rates combined with historical malaria-driven founder effects create distinct genetic architectures that demand population-specific therapeutic approaches.
Methods: In a prospective, hospital-based cohort study (2021–2025), we analyzed samples collected across multiple centres in Pakistan. The HBB gene was sequenced comprehensively by next-generation sequencing (NGS) at >100x coverage, followed by variant calling with the GATK pipeline and classification of samples by zygosity. All variants and novel compound heterozygous combinations were cross-referenced against the ClinVar, HGMD, and HbVar databases.
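The zygosity classification step can be illustrated with a minimal sketch. It assumes each sample has already been reduced to two phased HBB alleles, each either an HGVS-style variant string or `None` for wild type; the function name `classify_zygosity`, this encoding, and the three example samples are hypothetical and are not study data. Complex patterns with more than two variants (such as the tri-allelic case) would need a richer representation than this.

```python
from collections import Counter

def classify_zygosity(allele1, allele2):
    """Classify one sample from its two HBB alleles.

    Each argument is an HGVS-style variant string, or None for a
    wild-type allele (a hypothetical encoding chosen for this sketch).
    """
    if allele1 is None and allele2 is None:
        return "wild-type"
    if allele1 is None or allele2 is None:
        # One variant allele, one wild-type allele.
        return "heterozygous"
    if allele1 == allele2:
        # The same variant on both alleles.
        return "homozygous"
    # Two different variants, assumed to be in trans.
    return "compound heterozygous"

# Illustrative samples only; these are not taken from the cohort.
samples = [
    ("c.92+5G>C", "c.92+5G>C"),
    ("c.27dupG", "c.92+5G>C"),
    ("c.27dupG", None),
]
counts = Counter(classify_zygosity(a, b) for a, b in samples)
```

Treating two different variants as compound heterozygous presumes they sit on opposite chromosomes; in practice that would have to be confirmed by phasing or parental testing.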
Results: Among the analyzed samples, 1,269 (59.7%) were homozygous, 428 (20.2%) compound heterozygous, and 427 (20.1%) heterozygous. Gender distribution was balanced (49.3% male vs 50.7% female), consistent with autosomal recessive inheritance (p<0.001). Two predominant variants, HBB:c.92+5G>C (553 patients, 43% of homozygous cases) and HBB:c.27dupG (291 patients, 23% of homozygous cases), together accounted for 66.5% of severe cases. In total, 84 unique variants were identified, with marked genetic stratification: the top 5 variants accounted for 82.4% of homozygous cases (95% CI: 80.1–84.7%), indicating strong founder effects. We also identified 13 previously undocumented compound heterozygous combinations, detected between 2021 and 2024, in which common variants occur in new allelic arrangements. These novel combinations span frameshift-splicing (c.25_26delAA~c.92+5G>C), missense-frameshift (c.79G>A~c.92+5G>C), one unique complex tri-allelic pattern (c.16C>T+c.15T>A+c.19G>T), and dual splicing variants (c.93-3T>G~c.92+5G>C), expanding the known mutation spectrum. The common variants c.92+5G>C and c.27dupG appeared in 5 and 3 novel combinations, respectively, showing how population-specific allelic arrangements generate previously unreported genetic diversity.
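The zygosity percentages and the combined share of the two predominant variants can be re-derived from the reported counts. The sketch below assumes the 66.5% figure uses homozygous cases as its denominator; that is an inference from the numbers, since the text describes it as a share of "severe cases". The variable names are illustrative.

```python
# Counts as reported in the Results.
zygosity_counts = {
    "homozygous": 1269,
    "compound heterozygous": 428,
    "heterozygous": 427,
}
total = sum(zygosity_counts.values())

# Percentage of the cohort in each zygosity class, to one decimal place.
shares = {k: round(100 * v / total, 1) for k, v in zygosity_counts.items()}

# Combined share of c.92+5G>C (553) and c.27dupG (291),
# assuming homozygous cases are the denominator.
top_two = 553 + 291
top_two_share = round(100 * top_two / zygosity_counts["homozygous"], 1)

print(total, shares, top_two_share)
```

Under that assumption the counts reproduce the stated 59.7%, 20.2%, 20.1%, and 66.5%.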
Conclusions: Our study demonstrates a distinctive Pakistani β-thalassemia genetic signature. Five variants account for more than 80% of severe disease, which creates substantial opportunities for precision diagnostics and gene-targeted therapies. Moreover, the discovery of 13 novel compound heterozygous combinations expands the global mutation spectrum and provides critical insights for population-specific treatment strategies.